17,878 results on "ENCYCLOPEDIAS & dictionaries"
Search Results
2. The Effects of Sentiment Evolution in Financial Texts: A Word Embedding Approach.
- Author: Zheng, Jiexin, Ng, Ka Chung, Zheng, Rong, and Tam, Kar Yan
- Subjects: ACCOUNTING fraud; STRATEGIC communication; ENCYCLOPEDIAS & dictionaries; FORENSIC accounting; TELECONFERENCING
- Abstract:
We examine the evolutionary effects of sentiment words in financial text and their implications for various business outcomes. We propose an algorithm called Word List Vector for Sentiment (WOLVES) that leverages both a human-defined sentiment word list and the word embedding approach to quantify text sentiment over time. We then apply WOLVES to investigate the evolutionary effects of the most popular financial word list, Loughran and McDonald (LM) dictionary, in annual reports, conference calls, and financial news. We find that LM negative words become less negative over time in annual reports compared to conference calls and financial news, while LM positive words remain qualitatively unchanged. This finding reconciles with existing evidence that negative words are more subject to managers' strategic communication. We also provide practical implications of WOLVES by correlating the sentiment evolution of LM negative words in annual reports with market reaction, earnings performance, and accounting fraud. [ABSTRACT FROM AUTHOR]
- Published: 2024
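The core idea behind embedding-based sentiment tracking can be illustrated with a minimal sketch: score a word's polarity in each time period as its average cosine similarity to positive seed words minus its similarity to negative seeds, then compare periods. The tiny 2-d vectors and word lists below are invented for illustration; this is not the WOLVES algorithm itself.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def polarity(word, embeddings, pos_seeds, neg_seeds):
    # mean similarity to positive seeds minus mean similarity to negative seeds
    pos = sum(cosine(embeddings[word], embeddings[s]) for s in pos_seeds) / len(pos_seeds)
    neg = sum(cosine(embeddings[word], embeddings[s]) for s in neg_seeds) / len(neg_seeds)
    return pos - neg

# hypothetical embeddings trained on two report periods (toy 2-d vectors)
period_1 = {"gain": (1.0, 0.0), "improve": (0.9, 0.1),
            "loss": (-1.0, 0.0), "fail": (-0.9, -0.1),
            "impairment": (-0.8, 0.1)}
period_2 = dict(period_1, impairment=(-0.3, 0.4))  # usage has drifted

POS, NEG = ["gain", "improve"], ["loss", "fail"]
p1 = polarity("impairment", period_1, POS, NEG)
p2 = polarity("impairment", period_2, POS, NEG)
# p1 < p2 < 0: "impairment" stays negative but becomes less so over time
```

Comparing the same word list against period-specific embedding spaces is what makes drift measurable at all.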
3. Oxford Dictionary of the Christian Church.
- Author: Platten, Stephen
- Subjects: ENCYCLOPEDIAS & dictionaries; RELIGIONS; CHRISTIAN ethics
- Abstract:
The fourth edition of the Oxford Dictionary of the Christian Church, edited by Andrew Louth, is a significant event in the field. The first two editions were influenced by Anglicanism, but the third edition reflected changing theological climates. The current edition expands the number of articles on Christianity and includes traditional churches, new movements, leaders, and theological concerns. The edition also addresses contemporary issues and includes over 480 contributors and 6500 entries. While there may be some subjective decisions on what to include, overall, the edition maintains a sense of balance and proportion. [Extracted from the article]
- Published: 2024
4. A New Pipeline Condition Assessment Method Using a Multicomponent Interferometric Dictionary for Quantification of Pipeline Notches.
- Author: Zang, Xulei, Xu, Zhao-Dong, Zhu, Chen, Lu, Hongfang, Peng, Haoyan, and Zhang, Zhenwu
- Subjects: ENCYCLOPEDIAS & dictionaries; NOTCH effect; WAVE packets; WAVEGUIDES; SIGNAL-to-noise ratio; PIPELINES; ALUMINUM
- Abstract:
This paper proposes an improved matching pursuit (MP) method with a multicomponent interference dictionary (MCID) based on a pipeline guided wave reflection model. The proposed method separates the overlapped time domain reflections from the notch edges and extracts parameters such as amplitude, time of flight (TOF), and phase to identify the number and axial dimensions of notches in pipes. Firstly, the performance of the method in identifying parameters of overlapped components affected by noise and reverberation under different signal-to-noise ratios (SNRs) and frequencies is analyzed using four error metrics. Secondly, finite-element (FE) models of pipes with a single notch and double notches are established, and the accuracy of the reflection model is validated by comparing the predicted amplitudes, TOF, and phase of reflections with the FE results. Finally, experimental validation is conducted on aluminum pipes with multiple notches. The consistency between the experimental results, theoretical results, and FE results in wave packet parameters confirmed the accuracy of the proposed reflection models. The method accurately captures the parameters of the reflections from each notch edge in the experimental pipes, enabling the identification of the number and axial dimensions of the notches. The method holds the potential for identifying a greater number of notches in pipelines and characterizing their axial dimensions. This can provide a reliable reference for developing maintenance plans based on pipeline operational conditions, thereby preventing major safety incidents. [ABSTRACT FROM AUTHOR]
- Published: 2024
5. CIDER: Context-sensitive polarity measurement for short-form text.
- Author: Young, James C., Arthur, Rudy, and Williams, Hywel T. P.
- Subjects: SENTIMENT analysis; LINGUISTIC context; HEADLINES; LINGUISTIC analysis; CIDER (Alcoholic beverage); ENCYCLOPEDIAS & dictionaries
- Abstract:
Researchers commonly perform sentiment analysis on large collections of short texts like tweets, Reddit posts or newspaper headlines that are all focused on a specific topic, theme or event. Usually, general-purpose sentiment analysis methods are used. These perform well on average but miss the variation in meaning that happens across different contexts, for example, the word "active" has a very different intention and valence in the phrase "active lifestyle" versus "active volcano". This work presents a new approach, CIDER (Context Informed Dictionary and sEmantic Reasoner), which performs context-sensitive linguistic analysis, where the valence of sentiment-laden terms is inferred from the whole corpus before being used to score the individual texts. In this paper, we detail the CIDER algorithm and demonstrate that it outperforms state-of-the-art generalist unsupervised sentiment analysis techniques on a large collection of tweets about the weather. CIDER is also applicable to alternative (non-sentiment) linguistic scales. A case study on gender in the UK is presented, with the identification of highly gendered and sentiment-laden days. We have made our implementation of CIDER available as a Python package: https://pypi.org/project/ciderpolarity/. [ABSTRACT FROM AUTHOR]
- Published: 2024
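A corpus-informed lexicon can be sketched in a few lines: estimate each term's valence from how often it co-occurs with positive versus negative seed words in the target corpus, then score texts with the induced lexicon. This is a simplified stand-in for the idea, not CIDER's actual algorithm; the seed lists and corpus are invented.

```python
import math
from collections import Counter

POS_SEEDS, NEG_SEEDS = {"good", "great"}, {"bad", "awful"}

def induce_lexicon(docs):
    # count, for every term, the documents where it co-occurs with each seed set
    pos_co, neg_co = Counter(), Counter()
    for doc in docs:
        words = set(doc.split())
        for w in words - POS_SEEDS - NEG_SEEDS:
            if words & POS_SEEDS:
                pos_co[w] += 1
            if words & NEG_SEEDS:
                neg_co[w] += 1
    vocab = set(pos_co) | set(neg_co)
    # smoothed log-odds valence; positive => leans positive in this corpus
    return {w: math.log((pos_co[w] + 0.5) / (neg_co[w] + 0.5)) for w in vocab}

def score(text, lexicon):
    hits = [lexicon[w] for w in text.split() if w in lexicon]
    return sum(hits) / len(hits) if hits else 0.0

docs = ["sunny weather is good", "sunny and great outside",
        "rain is bad", "awful rain today"]
lex = induce_lexicon(docs)
# "sunny" is induced as positive and "rain" as negative from this corpus alone
```

The context sensitivity comes from the induction step: the same word could receive a different valence on a different corpus.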
6. Space-efficient computation of k-mer dictionaries for large values of k.
- Author: Díaz-Domínguez, Diego, Leinonen, Miika, and Salmela, Leena
- Subjects: ENCYCLOPEDIAS & dictionaries; SIGNS & symbols; DNA; COMPUTERS; MEMORY
- Abstract:
Computing k-mer frequencies in a collection of reads is a common procedure in many genomic applications. Several state-of-the-art k-mer counters rely on hash tables to carry out this task, but they are often optimised for small k, as a hash table keeping keys explicitly (i.e., k-mer sequences) takes O(Nkw) computer words, N being the number of distinct k-mers and w the computer word size, which is impractical for large values of k. This space usage is an important limitation, as analysis of long and accurate HiFi sequencing reads can require larger values of k. We propose Kaarme, a space-efficient hash table for k-mers using O(N + ukw) words of space, where u is the number of reads. Our framework exploits the fact that consecutive k-mers overlap by k - 1 symbols. Thus, we only store the last symbol of a k-mer and a pointer within the hash table to a previous one, which we can use to recover the remaining k - 1 symbols. We adapt Kaarme to compute canonical k-mers as well. This variant also uses pointers within the hash table to save space but requires more work to decode the k-mers. Specifically, it takes O(σk) time in the worst case, σ being the DNA alphabet, but our experiments show this is hardly ever the case. The canonical variant does not improve our theoretical results but greatly reduces space usage in practice while keeping competitive performance in retrieving the k-mers and their frequencies. We compare canonical Kaarme to a regular hash table storing canonical k-mers explicitly as keys and show that our method uses up to five times less space while being less than 1.5 times slower. We also show that canonical Kaarme uses significantly less memory than state-of-the-art k-mer counters when they do not resort to disk to keep intermediate results. [ABSTRACT FROM AUTHOR]
- Published: 2024
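The pointer trick the abstract describes can be sketched with a toy open-addressing table: each slot stores only a k-mer's last symbol, a reference to the previous k-mer's slot (or an explicit (k-1)-symbol prefix for a read's first k-mer), and a count; full k-mers are reconstructed on demand by chasing pointers. This is a minimal Python sketch of that idea, without Kaarme's canonical variant or its actual engineering — reconstruction here is O(read length) in the worst case.

```python
class KmerTable:
    """Toy Kaarme-style table: slots hold (last_symbol, prev, count)."""

    def __init__(self, k, size=1024):
        self.k, self.size = k, size
        self.slots = [None] * size

    def _kmer_at(self, slot):
        # rebuild the k-mer by following pointers back through the read
        last, prev, _ = self.slots[slot]
        if isinstance(prev, str):          # explicit (k-1)-prefix: first k-mer of a read
            return prev + last
        return self._kmer_at(prev)[1:] + last

    def _add(self, kmer, prev_ref):
        h = hash(kmer) % self.size
        while self.slots[h] is not None:   # linear probing
            if self._kmer_at(h) == kmer:
                last, prev, c = self.slots[h]
                self.slots[h] = (last, prev, c + 1)
                return h
            h = (h + 1) % self.size
        self.slots[h] = (kmer[-1], prev_ref, 1)
        return h

    def count_read(self, read):
        prev = read[:self.k - 1]           # only a read's first k-mer stores its prefix
        for i in range(len(read) - self.k + 1):
            prev = self._add(read[i:i + self.k], prev)

    def frequencies(self):
        return {self._kmer_at(i): s[2]
                for i, s in enumerate(self.slots) if s is not None}

table = KmerTable(k=4)
table.count_read("ACGTACGT")
```

Note that even a repeated k-mer is stored once: the lookup reconstructs candidates and increments the existing slot, which is where the O(N) space term comes from.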
7. Du Fu's conspicuous negativity and Li Bai's hidden positivity: a sentiment comparison and exploration.
- Author: Meng, Yingying, Wan, Yuwei, and Kit, Chunyu
- Subjects: POETRY collections; CHINESE poetry; OPTIMISM; SENTIMENT analysis; ENCYCLOPEDIAS & dictionaries
- Abstract:
In the studies of classical Chinese poetry, the comparison between Li Bai and Du Fu is an everlasting topic, yielding many qualitative interpretations, among which a widely known but disputable one is Li's positivity versus Du's negativity. With the development of digital means, distant reading has become possible, and the sentiment issue can be further explored in quantitative ways. This research conducts a corpus-based sentiment comparison of Li and Du with a self-constructed sentiment dictionary. The Complete Collection of Tang Poems is used as a representative of Tang poets, and sentiment comparisons are made at the levels of poems, verses, and characters, as well as key characters extracted with the log-likelihood measure. Analyses show that (1) among Tang poets, Du is more negative at all of the above textual levels, while Li is only more positive at the key character level, proving the importance of key characters in readers' perception of sentiment; (2) Li and Du both stand out among Tang poets with a negative depiction of the dark reality and a positive expression of grand ideals; and (3) Li's positivity is largely embodied in his depictions of color, light, and temperature, while Du's negativity is closely related to his psychological description. To conclude, this research has not only determined the sentiment difference between Li and Du but also located its sources in texts with a novel key character-based sentiment analysis approach. [ABSTRACT FROM AUTHOR]
- Published: 2024
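The log-likelihood measure used above to extract key characters compares a character's frequency in one poet's corpus against a reference corpus. A sketch of Dunning's log-likelihood statistic, with invented toy counts (not data from the study):

```python
import math

def log_likelihood(a, b, c, d):
    """Dunning's G2 for an item occurring a times in corpus A (size c)
    and b times in corpus B (size d); larger means more distinctive of A or B."""
    e1 = c * (a + b) / (c + d)   # expected counts if usage were uniform
    e2 = d * (a + b) / (c + d)
    g2 = 0.0
    if a:
        g2 += a * math.log(a / e1)
    if b:
        g2 += b * math.log(b / e2)
    return 2 * g2

# a character used 50 times per 1000 by one poet but only 10 per 1000 elsewhere
distinctive = log_likelihood(50, 10, 1000, 1000)
balanced = log_likelihood(30, 30, 1000, 1000)   # evenly used => G2 = 0
```

Ranking characters by this statistic yields the "key characters" whose valence then dominates readers' perception of a poet's sentiment.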
8. Sharp Bounds on the Approximation Rates, Metric Entropy, and n-Widths of Shallow Neural Networks.
- Author: Siegel, Jonathan W. and Xu, Jinchao
- Subjects: FUNCTIONS of bounded variation; TOPOLOGICAL entropy; ENTROPY; ENCYCLOPEDIAS & dictionaries
- Abstract:
In this article, we study approximation properties of the variation spaces corresponding to shallow neural networks with a variety of activation functions. We introduce two main tools for estimating the metric entropy, approximation rates, and n-widths of these spaces. First, we introduce the notion of a smoothly parameterized dictionary and give upper bounds on the nonlinear approximation rates, metric entropy, and n-widths of their absolute convex hull. The upper bounds depend upon the order of smoothness of the parameterization. This result is applied to dictionaries of ridge functions corresponding to shallow neural networks, and they improve upon existing results in many cases. Next, we provide a method for lower bounding the metric entropy and n-widths of variation spaces which contain certain classes of ridge functions. This result gives sharp lower bounds on the L2-approximation rates, metric entropy, and n-widths for variation spaces corresponding to neural networks with a range of important activation functions, including ReLU^k activation functions and sigmoidal activation functions with bounded variation. [ABSTRACT FROM AUTHOR]
- Published: 2024
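For context, the classical baseline that sharper dictionary-approximation bounds refine is the Maurey–Jones–Barron estimate: if f lies in the closed absolute convex hull of a dictionary D bounded in a Hilbert space H, an n-term combination from D achieves (stated here from the standard literature, not from the article itself):

```latex
\[
\inf_{\substack{f_n = \sum_{i=1}^{n} a_i d_i \\ d_i \in \mathbb{D},\; \sum_i |a_i| \le 1}}
\| f - f_n \|_{H}
\;\le\; \frac{\sup_{d \in \mathbb{D}} \| d \|_{H}}{\sqrt{n}},
\qquad f \in \overline{\mathrm{conv}}(\pm \mathbb{D}).
\]
```

The article's contribution is to show how smoothness of the dictionary's parameterization improves on this generic n^{-1/2} rate, and to match the upper bounds with lower bounds.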
9. Encyclopaedias in newspapers in British colonial America and the early United States.
- Author: Loveland, Jeff
- Subjects: ENCYCLOPEDIAS & dictionaries; NEWSPAPERS; DATABASES; HISTORY of the book; BRITISH colonies
- Abstract:
In this article, relying on Readex's electronic database 'Early American Newspapers, Series 1, 1690–1876: From Colonies to Nation', I use the evidence of newspapers to construct a picture of the encyclopaedias most mentioned in British North America and the early United States during the period before 1790. I begin by explaining my methodology and reviewing other approaches to studying the American market for encyclopaedias before 1790. Then, working on the assumption that mentions of encyclopaedias in newspapers roughly represent interest in encyclopaedias, I attempt to quantify Americans' evolving interest in different encyclopaedias and kinds of encyclopaedias. Lastly, having sketched out the landscape of encyclopaedias in eighteenth-century America, I consider the uses to which they were put. Among other conclusions, I argue that, despite assertions to the contrary by encyclopaedists and their publishers, encyclopaedias functioned less as substitutes for private libraries than as complements to them. [ABSTRACT FROM AUTHOR]
- Published: 2024
10. Lamb mode and damage identification using small-sample dictionary algorithm.
- Author: Li, Juanjuan
- Subjects: ENCYCLOPEDIAS & dictionaries; LAMB waves; LAMBS; WAVE packets; ALGORITHMS; IDENTIFICATION
- Abstract:
In this paper, a Lamb mode identification method based on a small-sample dictionary algorithm is proposed and applied to the separation of specific Lamb modes, the reconstruction of Lamb waves after propagating a certain distance, and damage identification. The approach includes the creation of a small-sample dictionary and a querying procedure over that dictionary. Firstly, Lamb wave signals propagated over a series of distances are simulated, and signal features, {mode, distance, time of flight (Tof), wavelet energy}, are extracted to create a dictionary; secondly, the Tof of the received signal is extracted, and Lamb modes are identified by searching the dictionary; finally, energy parameters are estimated to reconstruct wavepackets. The feasibility of this algorithm is validated on AAA laminate, and the results are presented. In a 2D simulation model of a pitch-catch configuration, the A0 and S0 modes can be identified and reconstructed effectively when the direct waves and the reflected waves are received synchronously, with propagation distances of 0.3 m and 0.5 m, respectively. In addition, Lamb-wave-based delamination location is conducted on three-dimensional AAA laminate. The experimental results show that the delamination can be located by combining the identified damage-scattered S0 waves with probability-based diagnostic imaging. [ABSTRACT FROM AUTHOR]
- Published: 2024
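The dictionary query the abstract outlines amounts to a nearest-neighbour search on time of flight: precompute {mode, distance} -> ToF entries from assumed group velocities, then match a measured ToF to the closest entry. The velocities below are illustrative placeholders, not values from the paper.

```python
# assumed group velocities (m/s) for the two fundamental Lamb modes -- toy values
VELOCITY = {"S0": 5300.0, "A0": 3100.0}

def build_dictionary(distances):
    # dictionary entries: (mode, distance, time of flight)
    return [(mode, d, d / v) for mode, v in VELOCITY.items() for d in distances]

def identify(tof, dictionary):
    # nearest-neighbour query on time of flight
    return min(dictionary, key=lambda entry: abs(entry[2] - tof))

distances = [round(0.1 * i, 1) for i in range(1, 11)]     # 0.1 m .. 1.0 m
lamb_dict = build_dictionary(distances)
mode, dist, _ = identify(0.5 / 3100.0, lamb_dict)          # a wave that flew 0.5 m as A0
```

In the paper the matched entry additionally carries wavelet-energy parameters used to reconstruct the wavepacket; the sketch keeps only the identification step.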
11. A Text-Based Measure of Transactive Memory System Strength.
- Author: Kush, Jonathan, Aven, Brandy, and Argote, Linda
- Subjects: ENCYCLOPEDIAS & dictionaries; MEMORY; TERMS & phrases
- Abstract:
We develop a method to assess the three indicators of transactive memory systems (TMS)—specialization, credibility, and coordination—through computer-aided text analysis. First, human coders assessed group transcripts for phrases representative of these indicators. From those phrases, we identified words that occurred frequently to develop a dictionary of TMS indicators. In total, we analyzed 262 groups composed of 1,091 individuals. Both the human-coded and dictionary-based assessments of TMS indicators are significantly related to a popular survey-based assessment of TMS. Our approach could be used to advance understanding of TMS by analyzing it in contexts where administering surveys is not feasible. [ABSTRACT FROM AUTHOR]
- Published: 2024
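Dictionary-based scoring of a transcript reduces to counting category words and normalizing by transcript length. A minimal sketch with invented indicator words — the article's actual TMS dictionary is not reproduced here:

```python
import re
from collections import Counter

# hypothetical indicator words for the three TMS dimensions
TMS_DICT = {
    "specialization": {"expert", "specialize", "knows"},
    "credibility": {"trust", "reliable", "accurate"},
    "coordination": {"together", "handoff", "smoothly"},
}

def tms_scores(transcript):
    tokens = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(tokens)
    total = len(tokens)
    # proportion of transcript words hitting each category
    return {cat: sum(counts[w] for w in words) / total
            for cat, words in TMS_DICT.items()}

scores = tms_scores("She is the expert here and knows the data; we trust her.")
```

Because the measure needs only text, it extends TMS research to settings where administering surveys is not feasible, which is the article's stated motivation.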
12. Chinese text dual attention network for aspect-level sentiment classification.
- Author: Sun, Xinjie, Liu, Zhifang, Li, Hui, Ying, Feng, and Tao, Yu
- Subjects: CHINESE language; SENTIMENT analysis; ENCYCLOPEDIAS & dictionaries; ENGLISH language; PROBLEM solving
- Abstract:
English text has a clear and compact subject structure, which makes it easy to find dependency relationships between words. Chinese text, however, often conveys information through situational context, resulting in loose sentence structures; most Chinese comments and experimental summary texts even lack subjects. This makes it challenging to determine dependency relationships between words in Chinese text, especially in aspect-level sentiment recognition. To address this problem, a Chinese text dual attention network for aspect-level sentiment recognition is proposed. First, Chinese syntactic dependency is introduced, and a sentiment dictionary is used to quickly and accurately extract aspect-level sentiment words and to perform opinion extraction and classification of sentiment trends in text. A CNN-BiLSTM model with positional encoding is then introduced to extract context-level features, and a two-level attention mechanism is used to better capture fine-grained aspect-level sentiment. Compared with ten advanced baseline models, the model achieves better performance, with accuracies of 0.9180, 0.9080, and 0.8380, respectively. Extensive experiments show that the method achieves higher performance in aspect-level sentiment recognition in less time, and ablation experiments demonstrate the importance of each module of the model. [ABSTRACT FROM AUTHOR]
- Published: 2024
13. Mufti Ahmad Yar Khan Naeemi's Dirayati Method in the Discussions of Tasawwuf.
- Author: Ahmad, Ali, Naveed, Muhammad, and Ansar, Muhammad
- Subjects: ENCYCLOPEDIAS & dictionaries; SEMANTICS; TRANSLATING & interpreting; SCHOLARS; SONS; FOURTEENTH century
- Abstract:
Mufti Ahmad Yar Khan Naeemi was a famous scholar of the 14th century of the Hijra in the history of the subcontinent. He was a great mufassir (exegete) of the Quran. He began writing Tafseer Naeemi but died during this great work, having completed only 11 paras; his son continued the task but completed only 20 paras before his own death. This article discusses and analyses Mufti Ahmad Yar Khan's dirayati method in his treatment of Tasawwuf. He explains the verses under different categories: the meanings of a verse from the Arabic dictionary, words and their meanings, the translation of Ahmad Raza Khan Barelvi, the relation of the verse to the preceding one, Shan-e-Nuzool (occasion of revelation), grammatical (nahwi) commentary, scholarly commentary, Fawaid-e-Ayat, fiqhi issues, answers to various objections, and Sufi commentary. His Sufi commentary addresses Tasawwuf and its terms. It can be concluded that he also applied dirayat in elaborating the matters of Tasawwuf in his tafseer. [ABSTRACT FROM AUTHOR]
- Published: 2024
14. The Morais Dictionary: Following Best Practices in a Retro-digitized Dictionary Project.
- Author: Salgado, Ana, Romary, Laurent, Costa, Rute, Tasovac, Toma, Khan, Anas Fahad, Ramos, Margarida, Almeida, Bruno, Carvalho, Sara, Khemakhem, Mohamed, Silva, Raquel, and Lehečka, Boris
- Subjects: ENCYCLOPEDIAS & dictionaries; BEST practices; DIGITAL preservation; METADATA
- Abstract:
This article outlines essential best practices for retro-digitized dictionary projects, using the ongoing MORDigital project (DOI 10.54499/PTDC/LLT-LIN/6841/2020) as a case study. The MORDigital project focuses on digitally transforming the historically significant Portuguese Morais dictionary's first three editions (1789, 1813, 1823). While the primary objective is to create faithful digital versions of these renowned dictionaries, MORDigital stands out by going beyond the mere adoption of established best practices. Instead, it reflects on the choices made throughout the process, providing insights into the decision-making process. The key topics emphasized include (1) the establishment of a robust data model; (2) the refinement of metadata; (3) the implementation of consistent identifiers; and (4) the enhancement of encoding techniques; additionally exploring the issue of structuring domain labelling. The article aims to contribute to the ongoing discourse on best practices in retro-digitized dictionary projects and their implications for data preservation and knowledge organization. [ABSTRACT FROM AUTHOR]
- Published: 2024
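One of the practices discussed, consistent identifiers, can be illustrated by minting stable ASCII identifiers for entries across editions. The ID scheme below is a hypothetical illustration of the practice, not the MORDigital project's actual scheme:

```python
import unicodedata

def make_entry_id(edition, headword, homograph=1):
    """Mint a stable ASCII identifier such as 'morais-1789-coracao-1'.

    The scheme (edition slug + diacritics-folded headword + homograph number)
    is an invented example of the 'consistent identifiers' practice."""
    folded = unicodedata.normalize("NFD", headword.lower())
    ascii_head = "".join(c for c in folded if not unicodedata.combining(c))
    ascii_head = "".join(c if c.isalnum() else "-" for c in ascii_head).strip("-")
    return f"{edition}-{ascii_head}-{homograph}"

entry_id = make_entry_id("morais-1789", "Coração")
```

Deterministic IDs of this kind let the 1789, 1813, and 1823 encodings of the same headword be linked and cited persistently.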
15. A relational extraction approach based on multiple embedding representations and multi-head self-attention.
- Author: Qin, Zhi, Liu, Enyang, Zhang, Shibin, Chang, Yan, and Yan, Lili
- Subjects: CONVOLUTIONAL neural networks; ENCYCLOPEDIAS & dictionaries; CHINESE language; KNOWLEDGE base
- Abstract:
Word segmentation errors and polysemy are currently common problems in Chinese relation extraction. Character-based model input can avoid word segmentation errors, but obtaining the word-level information of a sentence then usually requires introducing a dictionary or an external knowledge base to expand the word information, which costs considerable manpower and time. In response to these problems, this article uses characters as input and multiple embedding models to jointly form a character vector sequence, obtaining features containing character information through BiLSTM and attention layers. Considering that convolutional neural networks are good at extracting local features, features containing word information are obtained through multi-kernel convolutional layers and multi-head self-attention layers, and a gating mechanism finally fuses the two feature sets. The model was tested on the public SanWen data set and our own cultural-travel data set, obtaining F1 values of 61.22% and 60.26%, respectively. Experimental results show that our method achieves better relation extraction without using word segmentation tools and without building a dictionary or external knowledge base, and it outperforms most commonly used models. [ABSTRACT FROM AUTHOR]
- Published: 2024
16. Domain-Specific Dictionary between Human and Machine Languages.
- Author: Islam, Md Saiful and Liu, Fei
- Subjects: ENCYCLOPEDIAS & dictionaries; PROGRAMMING languages; KNOWLEDGE graphs; QUESTION answering systems; DATA mining; ARTIFICIAL intelligence
- Abstract:
In the realm of artificial intelligence, knowledge graphs have become an effective area of research. Relationships between entities are depicted through a structural framework in knowledge graphs. In this paper, we propose to build a domain-specific medicine dictionary (DSMD) based on the principles of knowledge graphs. Our dictionary is composed of structured triples, where each entity is defined as a concept, and these concepts are interconnected through relationships. This comprehensive dictionary boasts more than 348,000 triples, encompassing over 20,000 medicine brands and 1500 generic medicines. It presents an innovative method of storing and accessing medical data. Our dictionary facilitates various functionalities, including medicine brand information extraction, brand-specific queries, and queries involving two words or question answering. We anticipate that our dictionary will serve a broad spectrum of users, catering to both human users, such as a diverse range of healthcare professionals, and AI applications. [ABSTRACT FROM AUTHOR]
- Published: 2024
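The structured-triple design can be sketched as a tiny in-memory store with pattern queries — the basic operation behind brand lookups and two-term question answering. The sample triples are invented, not DSMD data:

```python
def query(triples, s=None, r=None, o=None):
    """Return triples matching the given pattern; None acts as a wildcard."""
    return [(ts, tr, to) for ts, tr, to in triples
            if s in (None, ts) and r in (None, tr) and o in (None, to)]

# invented sample of the (subject, relation, object) layout
triples = [
    ("Napa", "has_generic", "Paracetamol"),
    ("Napa", "has_strength", "500 mg"),
    ("Paracetamol", "treats", "fever"),
]

generic = query(triples, s="Napa", r="has_generic")[0][2]
uses = query(triples, s=generic, r="treats")   # two-hop question answering
```

Chaining two pattern queries, as in the last line, is how a triple-structured dictionary answers questions that relate a brand to the indications of its generic.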
17. Detecting Moral Features in TV Series with a Transformer Architecture through Dictionary-Based Word Embedding.
- Author: Fantozzi, Paolo, Rotondi, Valentina, Rizzolli, Matteo, Dalla Torre, Paola, and Naldi, Maurizio
- Subjects: DEEP learning; TELEVISION series; MORAL foundations theory; TRANSFORMER models; ENCYCLOPEDIAS & dictionaries
- Abstract:
Moral features are essential components of TV series, helping the audience to engage with the story, exploring themes beyond sheer entertainment, reflecting current social issues, and leaving a long-lasting impact on the viewers. Their presence shows through the language employed in the plot description. Their detection helps regarding understanding the series writers' underlying message. In this paper, we propose an approach to detect moral features in TV series. We rely on the Moral Foundations Theory (MFT) framework to classify moral features and use the associated MFT dictionary to identify the words expressing those features. Our approach combines that dictionary with word embedding and similarity analysis through a deep learning SBERT (Sentence-Bidirectional Encoder Representations from Transformers) architecture to quantify the comparative prominence of moral features. We validate the approach by applying it to the definition of the MFT moral feature labels as appearing in general authoritative dictionaries. We apply our technique to the summaries of a selection of TV series representative of several genres and relate the results to the actual content of each series, showing the consistency of results. [ABSTRACT FROM AUTHOR]
- Published: 2024
18. Incorporating network diffusion and peak location information for better single-cell ATAC-seq data analysis.
- Author: Yu, Jiating, Leng, Jiacheng, Hou, Zhichao, Sun, Duanchen, and Wu, Ling-Yun
- Subjects: GENETIC transcription regulation; GRAPH theory; ENCYCLOPEDIAS & dictionaries; GENOMES; EPIGENETICS
- Abstract:
Single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) data provided new insights into the understanding of epigenetic heterogeneity and transcriptional regulation. With the increasing abundance of dataset resources, there is an urgent need to extract more useful information through high-quality data analysis methods specifically designed for scATAC-seq. However, analyzing scATAC-seq data poses challenges due to its near binarization, high sparsity and ultra-high dimensionality properties. Here, we proposed a novel network diffusion–based computational method to comprehensively analyze scATAC-seq data, named Single-Cell ATAC-seq Analysis via Network Refinement with Peaks Location Information (SCARP). SCARP formulates the Network Refinement diffusion method under the graph theory framework to aggregate information from different network orders, effectively compensating for missing signals in the scATAC-seq data. By incorporating distance information between adjacent peaks on the genome, SCARP also contributes to depicting the co-accessibility of peaks. These two innovations empower SCARP to obtain lower-dimensional representations for both cells and peaks more effectively. We have demonstrated through sufficient experiments that SCARP facilitated superior analyses of scATAC-seq data. Specifically, SCARP exhibited outstanding cell clustering performance, enabling better elucidation of cell heterogeneity and the discovery of new biologically significant cell subpopulations. Additionally, SCARP was also instrumental in portraying co-accessibility relationships of accessible regions and providing new insight into transcriptional regulation. Consequently, SCARP identified genes that were involved in key Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways related to diseases and predicted reliable cis -regulatory interactions. To sum up, our studies suggested that SCARP is a promising tool to comprehensively analyze the scATAC-seq data. 
[ABSTRACT FROM AUTHOR]
- Published: 2024
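The network-diffusion step can be illustrated with a random walk with restart on a small graph: repeatedly mix the walker's distribution with a restart vector until it stabilizes, so each node aggregates signal from neighbours at multiple network orders. This is a generic diffusion sketch, not SCARP's Network Refinement formulation:

```python
def random_walk_with_restart(adj, seed, restart=0.5, iters=100):
    """adj: adjacency lists of an undirected graph; seed: index of the start node."""
    n = len(adj)
    p = [1.0 if i == seed else 0.0 for i in range(n)]
    for _ in range(iters):
        nxt = [0.0] * n
        for i, neighbours in enumerate(adj):
            share = (1 - restart) * p[i] / len(neighbours)
            for j in neighbours:           # walker spreads to neighbours...
                nxt[j] += share
        nxt[seed] += restart               # ...and teleports back with prob. `restart`
        p = nxt
    return p

# path graph 0-1-2-3: the diffusion score decays with distance from the seed
adj = [[1], [0, 2], [1, 3], [2]]
scores = random_walk_with_restart(adj, seed=0)
```

The smoothed scores are what compensate for the near-binary, sparse raw signal: a peak with no reads can still receive support from accessible neighbours in the graph.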
19. Benchmarking enrichment analysis methods with the disease pathway network.
- Author: Buzzao, Davide, Castresana-Aguirre, Miguel, Guala, Dimitri, and Sonnhammer, Erik L L
- Subjects: GENE expression; NULL hypothesis; ENCYCLOPEDIAS & dictionaries; SENSITIVITY & specificity (Statistics); SYSTEMS biology
- Abstract:
Enrichment analysis (EA) is a common approach to gain functional insights from genome-scale experiments. As a consequence, a large number of EA methods have been developed, yet it is unclear from previous studies which method is the best for a given dataset. The main issues with previous benchmarks include the complexity of correctly assigning true pathways to a test dataset, and lack of generality of the evaluation metrics, for which the rank of a single target pathway is commonly used. We here provide a generalized EA benchmark and apply it to the most widely used EA methods, representing all four categories of current approaches. The benchmark employs a new set of 82 curated gene expression datasets from DNA microarray and RNA-Seq experiments for 26 diseases, of which only 13 are cancers. In order to address the shortcomings of the single target pathway approach and to enhance the sensitivity evaluation, we present the Disease Pathway Network, in which related Kyoto Encyclopedia of Genes and Genomes pathways are linked. We introduce a novel approach to evaluate pathway EA by combining sensitivity and specificity to provide a balanced evaluation of EA methods. This approach identifies Network Enrichment Analysis methods as the overall top performers compared with overlap-based methods. By using randomized gene expression datasets, we explore the null hypothesis bias of each method, revealing that most of them produce skewed P -values. [ABSTRACT FROM AUTHOR]
- Published: 2024
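The overlap-based category of EA methods that the benchmark compares typically reduces to a hypergeometric (Fisher-style) over-representation test: given N background genes, K in a pathway, and n experimental hits, how surprising are k or more hits inside the pathway? A compact sketch with toy numbers:

```python
from math import comb

def hypergeom_enrichment_p(N, K, n, k):
    """P(X >= k) when drawing n genes from N of which K are pathway members."""
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / comb(N, n)

# toy numbers: 10 background genes, pathway of 5, 4 hits, 3 land in the pathway
p = hypergeom_enrichment_p(N=10, K=5, n=4, k=3)
```

The benchmark's point is precisely that such overlap tests can be outperformed by network-based enrichment methods, and that many methods' P-values deviate from uniform under randomized inputs.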
20. Antedating (in) the Oxford English Dictionary.
- Author: Williams, David-Antoine
- Subjects: ENCYCLOPEDIAS & dictionaries; BIBLIOGRAPHICAL citations; BIBLIOGRAPHY; BIBLIOGRAPHIC databases
- Abstract:
The article presents comparative analyses of the 1989 and 2022 revised editions of the "Oxford English Dictionary," focusing on the antedating of entries. It discusses the incidence of revised earliest dates of bibliographical adjustments, number of entries revised with near antedates, longer antedatings, antedatings attributed to the availability of high-quality historical text databases, and revisions to the 1989 edition earliest citations by quarterly update and original earliest date.
- Published: 2024
21. Evaluation of the Brazilian Portuguese version of linguistic inquiry and word count 2015 (BP-LIWC2015).
- Author: Carvalho, Flavio, Junior, Fabio Paschoal, Ogasawara, Eduardo, Ferrari, Lilian, and Guedes, Gustavo
- Subjects: PORTUGUESE language; WORD frequency; ENCYCLOPEDIAS & dictionaries; PSYCHOLINGUISTICS; CLASSIFICATION algorithms
- Abstract:
Text psycholinguistic features are a valuable source for various research topics since they are used to obtain psychological, social, and linguistic aspects from written texts using dictionary files. These files are structured in categories, which are defined as groups of dictionary words that tap a particular domain (e.g., negative emotion words). The Linguistic Inquiry Word Count (LIWC) is a vastly used and versatile computer-based language analysis tool designed for text psycholinguistic analysis. The most recent version of the default English dictionary is LIWC2015, as it was released with the 2015 version of the LIWC software. The literature has recently introduced the latest Brazilian Portuguese LIWC dictionary (BP-LIWC2015), developed with the same categories as the LIWC 2015 English dictionary. However, the literature has also reported the need to evaluate BP-LIWC2015. In this scenario, this work investigates three questions: (i) Since LIWC2015 shows consistent improvements over the English dictionary developed in 2007 (LIWC2007), does BP-LIWC2015 achieves better text classification results than the older Brazilian Portuguese dictionary (BP-LIWC2007)? (ii) What is the equivalence between BP-LIWC2015 and BP-LIWC2007 with LIWC2015? (iii) Are there significant differences between Brazilian Portuguese dictionaries? To answer these questions, we conducted text classification experiments with four datasets and seven classification algorithms to compare the two Brazilian Portuguese LIWC dictionaries reported in the literature (i.e., 2007 and 2015). Second, we used a bilingual Portuguese-English scientific news collection to analyze the correlation between LIWC2015 and Brazilian Portuguese LIWC dictionaries. The results indicate that BP-LIWC2015 outperforms the older version in Brazilian Portuguese text classification. Finally, we found a more significant correlation between BP-LIWC2015 and the original English dictionary than the older version. 
[ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
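LIWC-style analysis reduces to counting how many tokens of a text fall into each dictionary category. A minimal sketch of that counting scheme, assuming a toy two-category dictionary with LIWC-style prefix wildcards (the word lists below are invented for illustration and are not the actual LIWC2015 categories):

```python
import re
from collections import Counter

# Illustrative mini-dictionary: category -> word list, where a trailing '*'
# marks a prefix wildcard as in LIWC-format dictionaries. NOT the real lists.
DICTIONARY = {
    "negemo": ["loss*", "weak", "fail*", "adverse*"],
    "posemo": ["gain*", "strong", "improve*", "benefit*"],
}

def liwc_style_scores(text, dictionary=DICTIONARY):
    """Return the fraction of tokens falling into each category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for tok in tokens:
        for cat, words in dictionary.items():
            if any(tok == w or (w.endswith("*") and tok.startswith(w[:-1]))
                   for w in words):
                counts[cat] += 1
    n = len(tokens)
    return {cat: counts[cat] / n for cat in dictionary} if n else {}

scores = liwc_style_scores("Despite adverse conditions, margins improved and losses narrowed.")
```

Here "adverse" and "losses" hit the negative list via wildcards and "improved" hits the positive list, so the eight-token sentence scores 0.25 negative and 0.125 positive.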
22. Low-Rank Regression Models for Multiple Binary Responses and their Applications to Cancer Cell-Line Encyclopedia Data.
- Author
-
Park, Seyoung, Lee, Eun Ryung, and Zhao, Hongyu
- Subjects
- *
REGRESSION analysis , *ENCYCLOPEDIAS & dictionaries , *NONLINEAR regression , *LOW-rank matrices , *ISING model , *LOGISTIC regression analysis - Abstract
In this article, we study high-dimensional multivariate logistic regression models in which a common set of covariates is used to predict multiple binary outcomes simultaneously. Our work is primarily motivated by biomedical studies with correlated multiple responses, such as the cancer cell-line encyclopedia project. We assume that the underlying regression coefficient matrix is simultaneously low-rank and row-wise sparse. We propose an intuitively appealing selection and estimation framework based on the marginal model likelihood, and we develop an efficient computational algorithm for inference. We establish a novel high-dimensional theory for this nonlinear multivariate regression. Our theory is general, allowing for potential correlations between the binary responses. We propose a new type of nuclear norm penalty using the smooth clipped absolute deviation, filling a gap in the related non-convex penalization literature. We theoretically demonstrate that the proposed approach improves estimation accuracy by considering multiple responses jointly when the underlying coefficient matrix is low-rank and row-wise sparse. In particular, we establish non-asymptotic error bounds and both rank and row-support consistency of the proposed method. Moreover, we develop a consistent rule to simultaneously select the rank and row dimension of the coefficient matrix. Furthermore, we extend the proposed methods and theory to a joint Ising model, which accounts for the dependence relationships. In our analysis of both simulated data and the cancer cell-line encyclopedia data, the proposed methods outperform the existing methods in predicting responses. Supplementary materials for this article are available online. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
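The estimation problem the abstract describes can be sketched as a penalized likelihood. The notation below is assumed for illustration, not taken from the paper: with covariates $x_i \in \mathbb{R}^p$, binary responses $y_{ik} \in \{0,1\}$ for $k = 1,\dots,q$, and coefficient matrix $B = (b_1,\dots,b_q) \in \mathbb{R}^{p \times q}$, one plausibly minimizes

```latex
\min_{B}\; -\frac{1}{n}\sum_{i=1}^{n}\sum_{k=1}^{q}
  \Bigl[\, y_{ik}\, x_i^{\top} b_k \;-\; \log\bigl(1 + e^{x_i^{\top} b_k}\bigr) \Bigr]
\;+\; \sum_{j} p^{\mathrm{SCAD}}_{\lambda_1}\!\bigl(\sigma_j(B)\bigr)
\;+\; \lambda_2 \sum_{l=1}^{p} \lVert B_{l\cdot} \rVert_2
```

where $\sigma_j(B)$ are the singular values of $B$ (the SCAD-type nuclear-norm term encouraging low rank) and $B_{l\cdot}$ its rows (the group term encouraging row-wise sparsity).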
23. Enhancing Transparency in Defining Studied Drugs: The Open-Source Living DiAna Dictionary for Standardizing Drug Names in the FAERS.
- Author
-
Fusaroli, Michele, Giunchi, Valentina, Battini, Vera, Puligheddu, Stefano, Khouri, Charles, Carnovale, Carla, Raschi, Emanuel, and Poluzzi, Elisabetta
- Subjects
- *
ENCYCLOPEDIAS & dictionaries , *RESEARCH personnel , *MEDICATION safety , *DRUGS - Abstract
Introduction: In refining drug safety signals, defining the object of study is crucial. While research has explored the effect of different event definitions, drug definition is often overlooked. The US FDA Adverse Event Reporting System (FAERS) records drug names as free text, necessitating mapping to active ingredients. Although pre-mapped databases exist, the subjectivity and lack of transparency of the mapping process lead to a loss of control over the object of study. Objective: We implemented the DiAna dictionary, systematically mapping individual free-text instances to their corresponding active ingredients and linking them to the World Health Organization Anatomical Therapeutic Chemical (WHO-ATC) classification. Methods: We retrieved all drug names reported to the FAERS (2004–December 2022). Using existing vocabularies and string editing, we automatically mapped the free text to ingredients. We manually revised the mapping and linked it to the ATC classification. Results: We retrieved 18,151,842 reports, with 74,143,411 drug entries. We manually checked the first 14,832 terms (all terms occurring more than 200 times, covering 96.88% of total drug entries), mapping them to 6282 unique active ingredients. Automatic unchecked translations extend the standardization to 346,854 terms (98.94%). The DiAna dictionary showed higher sensitivity than RxNorm alone, particularly for specific drugs (e.g., rimegepant, adapalene, drospirenone, umeclidinium). The most prominent drug classes in the FAERS were immunomodulating (37.40%) and neurologic drugs (29.19%). Conclusion: The DiAna dictionary, as a dynamic open-source tool, provides transparency and flexibility, enabling researchers to actively shape drug definitions during the mapping phase. This empowerment enhances the accuracy, reproducibility, and interpretability of results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
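The core of any such free-text-to-ingredient mapping is normalizing reported strings before looking them up. A toy sketch of the idea, assuming an invented three-entry lookup table and ad-hoc normalization rules (the actual DiAna dictionary maps roughly 350,000 manually curated terms):

```python
import re

# Hypothetical mini-mapping of free-text drug names to active ingredients.
INGREDIENT_MAP = {
    "aspirin": "acetylsalicylic acid",
    "tylenol": "paracetamol",
    "paracetamol": "paracetamol",
}

def normalize(raw):
    """Lowercase, then strip punctuation, dosage, and dose-form tokens."""
    s = raw.lower()
    s = re.sub(r"[^a-z0-9 ]", " ", s)                            # punctuation
    s = re.sub(r"\b\d+\s*(mg|mcg|ml|g)\b", " ", s)               # "500 mg" etc.
    s = re.sub(r"\b(tablet|tablets|oral|capsule)s?\b", " ", s)   # dose forms
    return " ".join(s.split())

def map_to_ingredient(raw):
    """Look up the normalized string; unmatched terms stay visibly unmapped."""
    return INGREDIENT_MAP.get(normalize(raw), "UNMAPPED")

result = map_to_ingredient("TYLENOL 500 mg oral tablets")
```

Keeping unmatched terms as "UNMAPPED" rather than dropping them mirrors the transparency goal: the researcher sees exactly which free-text instances escaped the dictionary.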
24. Moral foundations and juror verdict justifications.
- Author
-
Yamamoto, Susan, Maeder, Evelyn M, and Bailey, Lin
- Subjects
- *
JURORS , *VERDICTS , *MORAL foundations theory , *ENCYCLOPEDIAS & dictionaries , *CRIME - Abstract
The purpose of this study was to examine the ways in which mock jurors justified their verdict decisions using moral foundations language. Participants read a trial transcript describing a second-degree murder charge featuring an automatism plea (which negates the physical volition of a crime). They then provided a two-to-three sentence rationale for their verdict choice, which we coded for the contextually-valid presence of words from the Moral Foundations (MF) Dictionary. Mock jurors were most likely to use harm-related language in justifying murder votes. A qualitative description also revealed differences in the content of the justifications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
25. Transport terminology translation variance model.
- Author
-
Dmitrieva, E., Grubin, I., Pluzhnikova, I., Stekolschikova, I., and Yudina, I.
- Subjects
- *
TERMS & phrases , *ENCYCLOPEDIAS & dictionaries - Abstract
The goal of the paper is to draw up a transport terminology translation variance model based on data from Russian-English and English-Russian translations. The relevance of the paper is underscored by the fact that transportation is one of the sectors where exact translation can be crucial. The following methods were used in the research: analysis of dictionary entries, statistical methods, classification, and modelling. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
26. Multiview feature co‐factorization based dictionary selection for video summarization.
- Author
-
Chen, Xiaoning, Ma, Mingyang, Yang, Runfeng, and Peng, Yong
- Subjects
- *
VIDEO summarization , *ENCYCLOPEDIAS & dictionaries , *MATRIX decomposition , *VIDEO processing , *VIDEO coding - Abstract
Recently, video summarization (VS) has emerged as one of the most effective tools for rapidly understanding video big data. Dictionary selection based on self‐representation and sparse regularization is consistent with the requirements of VS, which aims to represent the original video with little reconstruction error using a small number of video frames. However, one crucial issue is that existing methods mainly use a single-view feature, which is not sufficient to capture the full pictorial detail and affects the quality of the produced video summary. Although a few methods use more than one feature, they simply concatenate the features, which does not take advantage of the relationships between them. Considering the complementarity of shallow and deep features, multiview feature co‐factorization based dictionary selection for VS is proposed in this paper to exploit the information common to both view features. Specifically, two view features are used to fully capture the pictorial information of video frames, and the common information of the two views is then extracted through coupled matrix factorization to conduct dictionary selection for VS. Experiments on two benchmark datasets demonstrate the effectiveness and superiority of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Comparison of Purely Greedy and Orthogonal Greedy Algorithm.
- Author
-
Vishnevetskiy, K. S.
- Subjects
- *
HILBERT space , *ENCYCLOPEDIAS & dictionaries , *GREEDY algorithms , *COINCIDENCE - Abstract
Conditions for a dictionary in a Hilbert space are obtained that are necessary or sufficient for the coincidence of purely greedy and orthogonal greedy algorithms with respect to this dictionary. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
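The two algorithms differ only in the coefficient update: the pure greedy algorithm (matching pursuit) subtracts one projection per step, while the orthogonal greedy algorithm re-projects the signal onto the span of all atoms selected so far. A NumPy sketch of both; the final check illustrates one easy sufficient condition for coincidence of the kind the paper studies, namely an orthonormal dictionary:

```python
import numpy as np

def pure_greedy(f, D, steps):
    """Pure greedy algorithm: subtract one rank-one projection per step."""
    r, approx = f.copy(), np.zeros_like(f)
    for _ in range(steps):
        k = np.argmax(np.abs(D.T @ r))   # atom most correlated with residual
        c = D[:, k] @ r
        approx += c * D[:, k]
        r -= c * D[:, k]
    return approx

def orthogonal_greedy(f, D, steps):
    """Orthogonal greedy algorithm: re-project onto all selected atoms."""
    r, idx = f.copy(), []
    for _ in range(steps):
        idx.append(int(np.argmax(np.abs(D.T @ r))))
        A = D[:, idx]
        coef, *_ = np.linalg.lstsq(A, f, rcond=None)  # least-squares refit
        r = f - A @ coef
    return A @ coef

# For an orthonormal dictionary the two algorithms produce the same output.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))  # orthonormal columns
f = rng.standard_normal(6)
same = np.allclose(pure_greedy(f, Q, 3), orthogonal_greedy(f, Q, 3))
```

With orthonormal atoms the residual correlations are unchanged by each projection, so both algorithms select the same atoms and compute identical coefficients; the paper's interest is in characterizing when such coincidence holds for general dictionaries.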
28. Celestial two-point functions and rectified dictionary.
- Author
-
Furugori, Hideo, Ogawa, Naoki, Sugishita, Sotaro, and Waki, Takahiro
- Subjects
- *
ENCYCLOPEDIAS & dictionaries , *CONFORMAL field theory - Abstract
A naive celestial dictionary causes massless two-point functions to take the delta-function forms in the celestial conformal field theory (CCFT). We rectify the dictionary, involving the shadow transformation so that the two-point functions follow the standard power-law. In this new definition, we can smoothly take the massless limit of the massive dictionary. We also compute a three-point function using the new dictionary and discuss the OPE in CCFT. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
29. Matrix model correlators from non-Abelian T-dual of AdS5 × S5.
- Author
-
Roychowdhury, Dibakar
- Subjects
- *
CORRELATORS , *SCALAR field theory , *CHEMICAL potential , *VECTOR fields , *MATRICES (Mathematics) , *ENCYCLOPEDIAS & dictionaries - Abstract
We study various perturbations and their holographic interpretation for non-Abelian T-dual of AdS5 × S5 where the T-duality is applied along the SU(2) of AdS5. This paper focuses on two types of perturbations, namely the scalar and the vector fields on NATD of AdS5 × S5. For scalar perturbations, the corresponding solutions could be categorised into two classes. For one of these classes of solutions, we build up the associated holographic dictionary where the asymptotic radial mode sources scalar operators for the (0 + 1)d matrix model. These scalar operators correspond to either a marginal or an irrelevant deformation of the dual matrix model at strong coupling. We calculate the two point correlation between these scalar operators and explore their high as well as low frequency behaviour. We also discuss the completion of these geometries by setting an upper cut-off along the holographic axis and discuss the corresponding corrections to the scalar correlators in the dual matrix model. Finally, we extend our results for vector perturbations where we obtain asymptotic solutions for a particular class of modes. These are further used to calculate the boundary charge density at finite chemical potential. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. The Evolution of Our Approach to History Education Using Wikipedia.
- Author
-
Bawden, John R. and Hultquist, Clark E.
- Subjects
- *
HISTORY education in universities & colleges , *DIGITAL natives , *ENCYCLOPEDIAS & dictionaries - Abstract
The article explores the use of Wikipedia in teaching history in the U.S. Topics include why Wikipedia is the world's most visited website, digital natives' lack of knowledge about the process by which Wikipedia's articles are written and revised, a discussion of an assignment created for a University of Montevallo history class comparing traditional encyclopedia entries to their Wikipedia equivalents, and an overview of a Digital History elective course created at the university in 2014.
- Published
- 2024
31. Typological shift of Mandarin Chinese in terms of motion verb lexicalization pattern.
- Author
-
Linjun, Liu and Yingxin, He
- Subjects
- *
MANDARIN dialects , *CHINESE language , *VERBS , *MOTION analysis , *ENCYCLOPEDIAS & dictionaries - Abstract
Given the controversies over Mandarin Chinese in terms of Talmy's bipartite language typology, this paper presents an exhaustive study of Chinese motion verbs collected from two authoritative dictionaries, namely The Ancient Chinese Dictionary (2nd Edition) and The Contemporary Chinese Dictionary (7th Edition). An analysis of 662 motion verbs in ancient Chinese and 693 motion verbs in modern Chinese indicates that Mandarin Chinese has undergone a typological shift from verb-framed to satellite-framed as far as the lexicalization pattern is concerned. The typological shift seems to have been driven by two forces, the decline of monosyllabic motion verbs and the rise of disyllabic motion verbs, which can, on closer inspection, be traced to a single predominant process of disyllabification in Chinese, whereby two (former) roots bearing a wide range of syntactic relations are lexicalized into a disyllabic word. We thus see an intriguing case of how phonology and morphosyntax interact to shape the typological properties of a language. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. The Promotion of Traditional Values through Films and Television Programmes: The Moscow Patriarchate and the Orthodox Encyclopaedia Project (2005–2022).
- Author
-
Napolitano, Marianna
- Subjects
- *
INSTITUTIONAL environment , *RUSSIAN films , *VALUES (Ethics) , *TELEVISION programmers & programming , *ENCYCLOPEDIAS & dictionaries , *NEWS websites , *PUBLIC opinion polls , *CULTURAL values - Abstract
On 26 May 2011, the Russian People's World Council issued a document entitled The Basic Values: The Fundaments of National Unity. The document, prepared by the Synodal Department for Church–Society Cooperation, provided a catalogue of 17 traditional values whose general framework was constituted by a combination of freedom, unity, patriotism, family, and devotion. At that time, the Moscow Patriarchate considered religious faith to be the foundation of traditional values, and it continues to do so. The defence and promotion of traditional Russian spiritual and moral values were also central to the Russian National Security Strategy (2015), as well as to its updated version, issued in July 2021. Furthermore, they have been at the core of the Moscow Patriarchate's participation in the Council of Europe and of Patriarch Kirill's speeches about the war in Ukraine. Finally, on 9 November 2022, The Foundations of State Policy for the Preservation of Spiritual and Moral Values was approved. This framework permits us to understand the close interplay between the Church and the State in the Russian Federation and to see why it is important to refer to the concept of post-secularism when talking about the role of religion in post-Soviet Russia. Against this background, the present paper aims to analyse this interplay in a specific sector of visual culture: the cinema and television industries. Manuel Castells highlighted the relevance of cultural values in the age of information and the connection between those values and the social mobilization that follows from them. He pointed out that the Internet has become a way to render this connection predominant, inevitably leading to the development of social movements and networks that have a religious basis.
This is unquestionably true; surveys conducted by the Russian Public Opinion Research Center (OJSC «VCIOM») and by Nevafilm Research confirm that a high percentage of Russians watch films not only at the cinema or on television (especially the older generations) but also on the Internet (as far as the younger generations are concerned). The importance of this market is also confirmed by the success of the cinema and TV distributor Orthodox Encyclopaedia (2005); in the words of the philosopher Sergei Kravets, who, commenting on it during an interview published in 2006 by the website Sedmits.ru, declared that the expression "orthodox cinema" can be understood as a way to express Russian culture. He asserts that "the fact that today Orthodox films have begun to appear on the central TV channels testifies that Russian film producers and viewers have apparently begun to be aware of themselves as Orthodox, to feel that they are bearers of a special Orthodox culture. [..]". At the same time, consideration should be given to the importance of the Russian Orthodox Church and the Minister of Culture's condemnation of films such as Matilda or Monastery. In addition, it is important to consider that, according to a survey conducted in 2022 by the Levada Center, Russian people consider television the most reliable source of information (54%). The long-term implications of this tendency may have very important effects, not only in terms of its objectives but also in terms of the consideration that, after the beginning of the war, many Western film distributors withdrew their licenses from Russia. 
This paper will analyse "the effect of religion on the institutional system, the regulatory environment of the media and the public sphere" by studying the features of films and TV programs distributed by Orthodox Encyclopaedia, their relations with traditional values promoted both by the Kremlin and the Church, how these have contributed to strengthening the interplay between the Minister of Culture and the Moscow Patriarchate, and the impact this process has had on Russian society and Russia's relations with the European and Western World in the 2005–2022 period. A list of the films and TV programs being discussed will be provided, and then statements about the project and reviews of the serials and films will be analysed. The analysis will be conducted mainly through the official sites of the Russian Orthodox Church and the Kremlin and by browsing the Integrum database. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. The Curious Layperson: Fine-Grained Image Recognition Without Expert Labels.
- Author
-
Choudhury, Subhabrata, Laina, Iro, Rupprecht, Christian, and Vedaldi, Andrea
- Subjects
- *
IMAGE recognition (Computer vision) , *KNOWLEDGE base , *IMAGE registration , *ENCYCLOPEDIAS & dictionaries , *LAYPERSONS , *OBJECT tracking (Computer vision) - Abstract
Most of us are not experts in specific fields, such as ornithology. Nonetheless, we do have general image and language understanding capabilities that we use to match what we see to expert resources. This allows us to expand our knowledge and perform novel tasks without ad-hoc external supervision. In contrast, machines have a much harder time consulting expert-curated knowledge bases unless trained specifically with that knowledge in mind. Thus, in this paper we consider a new problem: fine-grained image recognition without expert annotations, which we address by leveraging the vast knowledge available in web encyclopedias. First, we learn a model to describe the visual appearance of objects using non-expert image descriptions. We then train a fine-grained textual similarity model that matches image descriptions with documents on a sentence-level basis. We evaluate the method on two datasets (CUB-200 and Oxford-102 Flowers) and compare with several strong baselines and the state of the art in cross-modal retrieval. Code is available at: https://github.com/subhc/clever. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. Matching pursuit with unbounded parameter domains.
- Author
-
Qu, Wei, Wang, Yanbo, and Sun, Xiaoyun
- Subjects
- *
HILBERT space , *ENCYCLOPEDIAS & dictionaries , *HARDY spaces , *OPTICAL disks - Abstract
In various applications, the adoption of optimal energy matching pursuit with dictionary elements is common. When the dictionary elements are indexed by parameters within a bounded region, exhaustion-type algorithms can be employed. This article aims to investigate a process that converts the optimal parameter selection in unbounded regions to a bounded and closed (compact) sub-domain. Such a process provides accessibility for energy matching pursuit in a wide range of applications. The paper initially focuses on the open unit disc and the upper-half complex plane, introducing adaptive Fourier decomposition as the underlying methodology. It then extends this concept to general Hilbert spaces with a dictionary and Bochner spaces for random signals. Computational examples are included to illustrate the concepts presented. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
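The practical upshot of such a compactification is that the max-energy atom can then be found by exhaustion over a grid of the bounded parameter sub-domain. A toy illustration of one matching-pursuit step, assuming Gaussian atoms whose scale parameter is naturally unbounded but is restricted here to a bounded grid (the atoms and bounds are invented for illustration, not the Szegő-kernel dictionary of adaptive Fourier decomposition):

```python
import numpy as np

t = np.linspace(-1.0, 1.0, 200)  # sampling grid for the signals

def atom(center, scale):
    """Unit-norm Gaussian atom; 'scale' a priori ranges over (0, inf)."""
    g = np.exp(-((t - center) / scale) ** 2)
    return g / np.linalg.norm(g)

def best_atom(signal, centers, scales):
    """Exhaustive search over a compact parameter grid for the max-energy atom."""
    best, best_val = None, -np.inf
    for c in centers:
        for s in scales:
            val = abs(atom(c, s) @ signal)   # energy captured by this atom
            if val > best_val:
                best, best_val = (c, s), val
    return best, best_val

# A signal built from a known atom; exhaustion over bounded grids recovers it.
signal = 2.0 * atom(0.3, 0.2)
params, energy = best_atom(signal,
                           centers=np.linspace(-0.9, 0.9, 19),
                           scales=np.linspace(0.05, 1.0, 20))
```

Since the true parameters lie on the grid, the search attains the full energy of 2; the article's contribution is showing when and how the unbounded scale axis can be shrunk to such a compact region without losing the optimum.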
35. Multi-mode dictionaries for fast CS-based dynamic MRI reconstruction.
- Author
-
Mubarak, Minha, Thomas, Thomas James, Rani J, Sheeba, and Mishra, Deepak
- Subjects
- *
MACHINE learning , *MAGNETIC resonance imaging , *COMPRESSED sensing , *FOUR-dimensional imaging , *ENCYCLOPEDIAS & dictionaries , *HUMAN physiology - Abstract
Dynamic Magnetic Resonance Imaging (dMRI) is a valuable tool for understanding changes in human physiology, but its temporal and spatial resolution can be limited. Compressed sensing (CS) has been used to enhance temporal resolution by acquiring partial k-spaces of each time frame and exploiting sparsity to retain spatial resolution. Invoking CS in dMRI necessitates algorithms that can leverage both spatial sparsity within each time frame and temporal sparsity across time frames. A tensor decomposition-based multi-mode dictionary learning algorithm has been proposed to learn the spatial and temporal features of dMRI data and reconstruct it more efficiently. The extensive quantitative simulations reveal the improvement induced by the proposed method in various settings compared to state-of-the-art methods in dMRI. Further, it considerably advances reconstruction speed from trained dictionaries over the state-of-the-art, permitting faster scans catering to a larger patient group. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. The Hegelian Art of the Table of Contents: On the logic, and tradition, of Hegel's organizational practices.
- Author
-
Kislev, S. F.
- Subjects
- *
ENCYCLOPEDIAS & dictionaries , *REFERENCE books , *CLASSIFICATION - Abstract
During the early 19th century, a peculiarly systematic way of organizing books emerged in Germany. This systematization, which purported to be a rational organization of subject matter, was an outgrowth of the philosophy of Hegel. This article attempts to outline Hegel's organizational practice. It argues that Hegel's encyclopedia was a reaction against the Enlightenment encyclopedia, and that it attempted to restore the systematic mindset of pre-modern reference books. Yet it did this, not in a straightforward fashion, but by developing a method for organizing material that both stresses the interconnectedness of knowledge and promises to be highly scalable. In its attempt to trace the Hegelian organizational method, this article revisits and calls attention to a forgotten work in Hegel scholarship by Michael John Petry, who was the first writer to seriously stress the importance of classification in Hegel's work. This article is an attempt to demonstrate that Hegelianism and information organization can be brought into fruitful and mutually beneficial dialogue. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. Do we learn from mistakes? The usefulness of examples of errors in online dictionaries.
- Author
-
Dziemianko, Anna
- Subjects
- *
ENCYCLOPEDIAS & dictionaries , *ERROR correction (Information theory) , *MONOLINGUALISM , *ENGLISH language education , *ACCURACY - Abstract
The aim of this paper is to investigate the usefulness of examples that show typical learner errors in online pedagogical dictionaries of English for the accuracy of error correction as well as immediate and delayed retention of usage. The optimal positioning of examples of errors in entries is also researched. In an online experiment, participants did a sentence correction exercise with the help of purpose-built monolingual dictionary entries, where the provision and positioning of examples showing errors were controlled. Two test versions were created, which differed only in the presence of examples of errors in the entries. Usage retention was observed immediately after the test and two weeks later. The results indicate that it is worthwhile to include examples of errors in online learners' dictionaries because they contribute greatly to the retention of usage in the long run. They also help to rectify errors, though the effect is not statistically significant. The positioning of examples showing errors in entries has no influence on error correction accuracy or usage retention. The study reveals examples of errors to be a valuable induction-oriented stand-alone dictionary component placed outside warning boxes, which typically include explicit grammar rules and promote deduction. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. A Tale of Sign Language Dictionary Making in the Netherlands.
- Author
-
Schermer, Trude
- Subjects
- *
SIGN language , *DEAF children , *ENCYCLOPEDIAS & dictionaries , *AMERICAN Sign Language , *TELECOMMUNICATION systems - Abstract
The article discusses the history of sign language dictionary making in the Netherlands. It begins with the author's introduction to the field of linguistics and their realization of the lack of sign language education for deaf children in the country. This led to a collaboration between the Dutch Foundation for the Deaf and Hard of Hearing Child, the University of Amsterdam, and the author's research on communication between hearing mothers and their deaf babies. The article then describes the KOMVA project, which aimed to inventory the signs used by deaf people in the Netherlands and compile a bilingual Dutch/NGT dictionary. The project resulted in the publication of the first national dictionary of NGT and the development of the Signbase database. The article also discusses the standardization of the NGT lexicon for educational purposes and the ongoing efforts to gain legal recognition for NGT as a language. The Dutch Sign Centre has played a significant role in spreading the NGT lexicon through various dictionaries and the online NGT dictionary. The article concludes with the recent recognition of NGT as a language by the Dutch government. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
39. RUSÇA DEYİM SÖZLÜKLERİNİN KURAMSAL VE UYGULAMA SORUNLARI.
- Author
-
DOHMAN, Ümmügülsüm
- Subjects
- *
ENCYCLOPEDIAS & dictionaries , *LEXICOGRAPHY , *IDIOMS - Abstract
The focus of this study is on the theoretical and practical problems related to Russian idiom dictionaries. The question of what kinds of theoretical and practical problems exist in the preparation of idiom dictionaries forms the basis of this research. The aim of this study is to examine the theoretical and practical problems encountered in the preparation of idiom dictionaries, and to evaluate selected idiom entries. The sources for this research include prominent Russian lexicography and phraseology studies, as well as Russian idiom dictionaries. In the study, the theoretical problems of idiom dictionaries, such as the selection of idioms, explanation criteria, the volume of information about idioms, the layout of the dictionary, the structure of the idiom item headings, and the presentation of examples related to the item headings are emphasized, and the approaches to solving these problems are examined by descriptive and comparative methods. In the findings section, the study examines the same idiom entries taken from various Russian dictionaries, including one explanatory dictionary and three idiom dictionaries, assessing their structure and content. After examining and assessing the dictionaries, recommendations were made regarding the limits and volume of the idiom dictionaries, as well as how idioms should be identified and included in the dictionary, and how to choose the idiom entry. It is considered that this study will offer a different perspective on the theoretical and practical problems of idiom dictionaries prepared not only in Russian but also in other languages. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. Derleme Sözlüğünde Geçen Akrabalık Adları.
- Author
-
TUĞLUK, Mehmet Emin
- Subjects
- *
PERSONAL names , *ENCYCLOPEDIAS & dictionaries , *DIALECTS - Abstract
The vocabulary of a language provides information about the lifestyles, traditions, customs, cultural values, and attitudes toward life of its speakers. Changes in society, and in the lives of the individuals who make it up, are therefore reflected in the language, and the scarcity, abundance, or frequency of use of terms for a given concept helps us understand the place of that concept in the society's life. Kinship is a concept that expresses the state of being related to others through blood or marriage. Turkish society is built on kinship relations and has a structure that values those relations and adheres to cultural values. Kinship terms therefore have an important place in the everyday language of Turkish society. A large number of kinship names are used in Standard Turkish and in the dialects of Turkish spoken in Turkey. The Compilation Dictionary (Derleme Sözlüğü), which records the vocabulary of these dialects, is the source of a rich stock of kinship names, and determining this vocabulary will contribute to studies on the vocabulary of kinship in Turkish. In this study, the nouns related to kinship in the Compilation Dictionary are classified under three headings: words indicating kinship by blood, words indicating kinship by marriage, and other words related to kinship. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
41. Learning the sparse prior: Modern approaches.
- Author
-
Peng, Guan‐Ju
- Subjects
- *
RECURRENT neural networks , *SUPERVISED learning , *ENCYCLOPEDIAS & dictionaries , *PRIOR learning , *DATA science , *LEARNING problems - Abstract
The sparse prior has been widely adopted to establish data models for numerous applications. In this context, most of them are based on one of three foundational paradigms: the conventional sparse representation, the convolutional sparse representation, and the multi‐layer convolutional sparse representation. When the data morphology has been adequately addressed, a sparse representation can be obtained by solving the sparse coding problem specified by the data model. This article presents a comprehensive overview of these three models and their corresponding sparse coding problems and demonstrates that they can be solved using convex and non‐convex optimization approaches. When the data morphology is not known or cannot be analyzed, it must be learned from training data, thereby formulating dictionary learning problems. This article addresses two different dictionary learning paradigms. In an unsupervised scenario, dictionary learning involves the alternating or joint resolution of sparse coding and dictionary updating. Another option is to create a recurrent neural network by unrolling algorithms designed to solve sparse coding problems. These networks can then be used in a supervised learning setting to facilitate the training of dictionaries via forward‐backward optimization. This article lists numerous applications in various domains and outlines several directions for future research related to the sparse prior. This article is categorized under: Statistical Learning and Exploratory Methods of the Data Sciences > Modeling Methods; Statistical and Graphical Methods of Data Analysis > Modeling Methods and Algorithms; Statistical Models > Nonlinear Models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
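In the conventional sparse representation, the sparse code of a signal x under a dictionary D solves the convex problem min_a ½‖x − Da‖² + λ‖a‖₁, the first paradigm the overview covers. A minimal iterative shrinkage-thresholding (ISTA) solver as a sketch; the dictionary, sparsity level, and step-size choices here are ad hoc for illustration:

```python
import numpy as np

def ista(x, D, lam=0.1, n_iter=500):
    """ISTA for min_a 0.5*||x - Da||^2 + lam*||a||_1."""
    L = np.linalg.norm(D, 2) ** 2       # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)        # gradient of the smooth term
        z = a - grad / L
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return a

# Recover a 2-sparse code from a random unit-norm dictionary.
rng = np.random.default_rng(1)
D = rng.standard_normal((30, 60))
D /= np.linalg.norm(D, axis=0)          # unit-norm atoms
a_true = np.zeros(60)
a_true[[5, 17]] = [1.5, -2.0]
x = D @ a_true
a_hat = ista(x, D, lam=0.05)
```

Unsupervised dictionary learning alternates this sparse-coding step with a dictionary update over a batch of signals; the unrolled-network paradigm instead stacks a fixed number of these iterations as network layers and trains D end to end.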
42. Association measures for collocation extraction: Automatic evaluation on a large-scale corpus.
- Author
-
Su, Qi, Gu, Chen, and Liu, Pengyuan
- Subjects
- *
COLLOCATION (Linguistics) , *INFORMATION retrieval , *COLLOCATION methods , *ENCYCLOPEDIAS & dictionaries , *CORPORA - Abstract
In this study, we propose a new evaluation scheme to assess the strengths and limitations of collocation extraction measures and explore type-sensitive methods for extracting collocations. We introduce the pooling strategy widely used in information retrieval and automate the evaluation process using online dictionaries. Sixteen well-known metrics are evaluated for effectiveness and then compared in terms of their distributional and linguistic properties. The results show that Group A methods (e.g. z-score, Dice, PMI) are more effective at extracting low-frequency collocations at relatively small extraction scales. In contrast, Group B methods (e.g. t-test, LMI, LLR) perform better at finding high-frequency collocations, and most of them outperform Group A methods as the extraction scale increases. Moreover, Group A prefers NN collocations, while Group B identifies collocations with a wide range of syntactic structures. This study offers guidance both for future work on hybrid extraction methods and for language educators and dictionary compilers. [ABSTRACT FROM AUTHOR]
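Two of the association measures compared here, PMI (Group A) and the t-test score (Group B), reduce to simple formulas over corpus counts. A minimal sketch with hypothetical counts (the numbers are illustrative, not from the study's corpus):

```python
import math

def pmi(c_xy, c_x, c_y, n):
    """Pointwise mutual information: log2 of observed vs. expected co-occurrence."""
    return math.log2((c_xy * n) / (c_x * c_y))

def t_score(c_xy, c_x, c_y, n):
    """t-test association score: (observed - expected) / sqrt(observed)."""
    expected = c_x * c_y / n
    return (c_xy - expected) / math.sqrt(c_xy)

# Hypothetical counts: pair seen 30 times, words seen 1000 and 500 times, 1M-token corpus
print(round(pmi(30, 1000, 500, 1_000_000), 2))      # 5.91
print(round(t_score(30, 1000, 500, 1_000_000), 2))  # 5.39
```

PMI's ratio form explains the Group A bias toward low-frequency pairs: a rare pair that always co-occurs gets a very high score, whereas the t-score's sqrt(observed) denominator favors high-frequency evidence.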
- Published
- 2024
- Full Text
- View/download PDF
43. Efficient fault-tolerant quantum dialogue protocols based on dictionary encoding without decoy photons.
- Author
-
Chang, Chen-Yu and Lin, Jason
- Subjects
- *
ENCYCLOPEDIAS & dictionaries , *PHOTONS , *QUANTUM states , *QUANTUM communication , *QUBITS - Abstract
This paper proposes two efficient block transmission and two efficient two-step transmission quantum dialogue (QD) protocols that are robust against collective-dephasing and collective-rotation noises, respectively. To counter collective noise, the carriers of the message must correspond to decoherence-free states under that noise. In addition to carrying messages, these quantum states and their combinations are used to secure the transmission and prevent message distortion. In quantum communications, decoy photons are often used to detect eavesdroppers and always account for half of the total number of qubits, which is a burden on scarce quantum resources. Because their states must be disclosed, the decoy photons used for inspection can no longer be utilized for transmission. Therefore, a dictionary is employed as an encoding mechanism to achieve self-checking without revealing these states, enabling photons to detect eavesdroppers while concurrently carrying keys. The proposed QD protocols eliminate the consumption of decoy photons and significantly improve qubit efficiency, and a security analysis confirms that there is no information leakage. The two transmission modes, which have the same efficiency, can be selected according to the noisy environment and available time windows. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. KEGGSum: Summarizing Genomic Pathways.
- Author
-
David, Chaim and Kondylakis, Haridimos
- Subjects
- *
BIOLOGICAL databases , *DATABASES , *ONLINE databases , *ENCYCLOPEDIAS & dictionaries , *GENOMES - Abstract
Over time, the renowned Kyoto Encyclopedia of Genes and Genomes (KEGG) has grown to become one of the most comprehensive online databases of biological processes. The majority of the data are stored in the form of pathways: graphs that depict the relationships between the diverse items participating in biological processes, such as genes and chemical compounds. However, the size, complexity, and diversity of these graphs make them difficult to explore and understand, and obscure which of their components are most important. In this regard, we present KEGGSum, a system enabling the efficient and effective summarization of KEGG pathways. KEGGSum receives a KEGG identifier (Kid) as input, connects to the KEGG database, downloads a specialized form of the pathway, and determines the most important nodes in the graph. To identify these nodes, we explore multiple centrality measures that have been proposed for generic graphs, showing their applicability to KEGG graphs as well. We then link the selected nodes to produce a summary graph from the initial KEGG graph. Finally, our system visualizes the generated summary, enabling an understanding of the most important parts of the initial graph. We experimentally evaluate our system and show its advantages and benefits. [ABSTRACT FROM AUTHOR]
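The centrality-based node selection KEGGSum relies on can be sketched with the simplest such measure, degree centrality, on a toy pathway-like graph. The node names here are invented for illustration, and the paper explores several centrality measures, not only this one:

```python
from collections import defaultdict

def degree_centrality(edges):
    """Degree centrality on an undirected graph: degree / (n - 1) for each node."""
    deg = defaultdict(int)
    nodes = set()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
        nodes.update((u, v))
    n = len(nodes)
    return {v: deg[v] / (n - 1) for v in nodes}

# Toy pathway: a hub gene linked to four compounds, two of which also interact
edges = [("geneA", "cpd1"), ("geneA", "cpd2"), ("geneA", "cpd3"),
         ("geneA", "cpd4"), ("cpd1", "cpd2")]
centrality = degree_centrality(edges)
hub = max(centrality, key=centrality.get)    # the most central node
```

Summarization then keeps the top-scoring nodes and re-links them, discarding peripheral nodes from the pathway graph.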
- Published
- 2024
- Full Text
- View/download PDF
45. Dictionary Encoding Based on Tagged Sentential Decision Diagrams.
- Author
-
Zhong, Deyuan, Fang, Liangda, and Guan, Quanlong
- Subjects
- *
ENCYCLOPEDIAS & dictionaries , *BOOLEAN functions , *ENCODING , *DECODING algorithms , *VIDEO coding - Abstract
Encoding a dictionary into another representation means that all of its words can be stored more efficiently, so that common dictionary operations, such as (1) searching for a word, (2) adding words, and (3) removing words, can be completed in less time. Binary decision diagrams (BDDs) are among the best-known such representations and are widely popular due to their excellent properties. Recent work has proposed encoding dictionaries into BDDs and several BDD variants and has shown this to be feasible, so we further investigate the topic of encoding dictionaries into decision diagrams. Tagged sentential decision diagrams (TSDDs), one of these variants based on structured decomposition, exploit both the standard and zero-suppressed trimming rules. In this paper, we first show how to represent dictionary files as Boolean functions, and then design an algorithm that encodes dictionaries into TSDDs with the help of tries, which greatly accelerates the encoding process, together with a decoding algorithm that restores dictionaries from TSDDs. Because TSDDs integrate two trimming rules, we expect them to represent dictionaries more effectively, and our experiments confirm this. [ABSTRACT FROM AUTHOR]
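The trie that accelerates the encoding step can be sketched as follows. This shows only the trie preprocessing and membership test, not the TSDD construction itself, and the words are illustrative:

```python
def build_trie(words):
    """Build a nested-dict trie; the '$' key marks end of word. A decision-diagram
    encoder would traverse this structure, sharing common prefixes between words."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = True
    return root

def contains(trie, word):
    """Membership test: follow the character path, then check the end marker."""
    node = trie
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return "$" in node

trie = build_trie(["cat", "car", "dog"])     # "cat" and "car" share the prefix "ca"
```

Prefix sharing is what speeds up encoding: each shared prefix is visited once rather than once per word.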
- Published
- 2024
- Full Text
- View/download PDF
46. ADMM optimizer for integrating wavelet-patch and group-based sparse representation for image inpainting.
- Author
-
Arya, Amit Soni, Saha, Akash, and Mukhopadhyay, Susanta
- Subjects
- *
PIXELS , *INPAINTING , *IMAGE representation , *PRINCIPAL components analysis , *SIGNAL-to-noise ratio , *ENCYCLOPEDIAS & dictionaries - Abstract
Recovery or filling in of missing pixels in damaged images is a challenging problem known as image inpainting. Many current techniques still suffer from artifacts and other visual defects. In the proposed inpainting approach, the authors combine wavelet patch-based and group-based sparse representation learning so as to exploit the benefits of (a) multiresolution decomposition using wavelets, (b) sparsity and (c) coherence. The proposed method creates multiple dictionaries by applying adaptive K-SVD (K-singular value decomposition) to the wavelet-decomposed components, and creates another dictionary by applying PCA (principal component analysis) to group-based image patches. Finally, to accomplish the inpainting operation, the two types of dictionaries are integrated using the ADMM (alternating direction method of multipliers). The proposed method has been tested on images with varying degrees of degradation in terms of the percentage of missing pixels or blocks. We have rated the performance and compared the proposed method with other state-of-the-art inpainting methods using measures such as the peak signal-to-noise ratio, the structural similarity index measure, and the figure of merit. The high values of these performance measures establish the efficacy of the proposed method. [ABSTRACT FROM AUTHOR]
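The ADMM integration step can be sketched in its simplest form: an ADMM solver for a single sparse-coding subproblem, min over x of 0.5||Ax - b||^2 + lam*||x||_1. The full method couples two learned dictionaries, which this toy single-dictionary version omits; all data below are synthetic assumptions:

```python
import numpy as np

def admm_lasso(A, b, lam=0.1, rho=1.0, n_iter=200):
    """ADMM for min 0.5*||A@x - b||^2 + lam*||z||_1 subject to x = z."""
    n = A.shape[1]
    chol = np.linalg.cholesky(A.T @ A + rho * np.eye(n))  # factor once, reuse each step
    Atb = A.T @ b
    z = np.zeros(n)
    u = np.zeros(n)
    for _ in range(n_iter):
        rhs = Atb + rho * (z - u)
        x = np.linalg.solve(chol.T, np.linalg.solve(chol, rhs))          # x-update
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)  # soft threshold
        u = u + x - z                                                    # dual update
    return z
```

With a random A and b generated from a 1-sparse code, the returned z concentrates on the true support; in the paper's setting the x- and z-updates would each involve one of the two dictionaries.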
- Published
- 2024
- Full Text
- View/download PDF
47. Multi-Dimensional Low-Rank with Weighted Schatten p-Norm Minimization for Hyperspectral Anomaly Detection.
- Author
-
Chen, Xi'ai, Wang, Zhen, Wang, Kaidong, Jia, Huidi, Han, Zhi, and Tang, Yandong
- Subjects
- *
INTRUSION detection systems (Computer security) , *ENCYCLOPEDIAS & dictionaries , *MATHEMATICAL regularization , *PROBLEM solving - Abstract
Hyperspectral anomaly detection is an important unsupervised binary classification problem that aims to effectively distinguish between background and anomalies in hyperspectral images (HSIs). In recent years, methods based on low-rank tensor representations have been proposed to decompose HSIs into a low-rank background tensor and a sparse anomaly tensor. However, current methods neglect the low-rank information in the spatial dimension and rely heavily on the background information contained in the dictionary. Furthermore, these algorithms show limited robustness when the dictionary information is missing or corrupted by high levels of noise. To address these problems, we propose a novel method called multi-dimensional low-rank (MDLR) for HSI anomaly detection. It first reconstructs three background tensors separately from three directional slices of the background tensor. Then, weighted Schatten p-norm minimization is employed to enforce the low-rank constraint on the background tensor, and L_{F,1}-norm regularization is used to describe the sparsity in the anomaly tensor. Finally, a well-designed alternating direction method of multipliers (ADMM) is employed to effectively solve the optimization problem. Extensive experiments on four real-world datasets show that our approach outperforms existing anomaly detection methods in terms of accuracy. [ABSTRACT FROM AUTHOR]
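For p = 1, weighted Schatten p-norm minimization reduces to weighted singular value thresholding, the proximal step an ADMM scheme like the one above would invoke. A minimal sketch; the matrix and weights are illustrative, and the paper's directional tensor-slice machinery is omitted:

```python
import numpy as np

def weighted_svt(X, weights):
    """Prox of the weighted nuclear norm (Schatten p = 1): shrink each singular
    value by its weight, zeroing small ones, which lowers the matrix rank."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - weights, 0.0)
    return U @ np.diag(s_shrunk) @ Vt

X = np.diag([5.0, 3.0, 0.5])                    # singular values 5, 3, 0.5
Y = weighted_svt(X, np.array([1.0, 1.0, 1.0]))  # shrunk to 4, 2, 0: rank drops to 2
```

Choosing larger weights for smaller singular values (as weighted schemes typically do) suppresses noise directions while preserving the dominant background structure.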
- Published
- 2024
- Full Text
- View/download PDF
48. A new perspective on the Sullivan dictionary via Assouad type dimensions and spectra.
- Author
-
Fraser, Jonathan M. and Stuart, Liam
- Subjects
- *
ENCYCLOPEDIAS & dictionaries , *HYPERBOLIC groups , *FINITE groups , *FRACTAL dimensions - Abstract
The Sullivan dictionary provides a beautiful correspondence between Kleinian groups acting on hyperbolic space and rational maps of the extended complex plane. We focus on the setting of geometrically finite Kleinian groups with parabolic elements and parabolic rational maps. In this context an especially direct correspondence exists concerning the dimension theory of the associated limit sets and Julia sets. In recent work we established formulae for the Assouad type dimensions and spectra for these fractal sets and certain conformal measures they support. This allows a rather more nuanced comparison of the two families in the context of dimension. In this expository article we discuss how these results provide new entries in the Sullivan dictionary, as well as revealing striking differences between the two families. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
49. Greedy versus optimal analysis of bounded size dictionary compression and on-the-fly distributed computing.
- Author
-
De Agostino, Sergio
- Subjects
- *
DISTRIBUTED computing , *ENCYCLOPEDIAS & dictionaries , *DATA compression , *DATA warehousing , *IMAGE compression , *DATA transmission systems - Abstract
Scalability and robustness are not an issue when compression is applied for massive data storage in the context of distributed computing. Speeding up on-the-fly compression for data transmission is more controversial: in that case, a compression technique merging an adaptive and a non-adaptive approach has to be considered. A practical implementation of LZW (Lempel, Ziv and Welch) compression, called LZC (C indicates the Unix command 'compress'), has this characteristic. The non-adaptive phases work with bounded-size prefix dictionaries built by LZW factorizations during the adaptive ones; both phases employ the greedy approach. We give a worst-case analysis of the greedy approach with respect to the optimal solution decodable by the LZC decompressor, and exhibit worst-case examples for which the solution cost produced by the distributed implementations reaches the Θ(d) theoretical upper bound on the optimal-cost approximation factor, where d is the dictionary size. We prove that for the worst-case examples of the totally adaptive version of LZW compression the approximation factor is Θ(d). This analysis suggests that parallelization of on-the-fly compression is not suitable for highly disseminated data, since the non-adaptive phases are too far from optimal. To improve compression effectiveness, we suggest applying LZMW (Lempel, Ziv, Miller and Wegman) factorization to LZC compression (LZCMW) during the adaptive phases; this factorization was originally conceived with a dictionary bounded by a least-recently-used strategy, and we introduce LZCMW in order to have non-adaptive phases. [ABSTRACT FROM AUTHOR]
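The greedy LZW factorization at the core of LZC (longest dictionary match at each step, growing the dictionary as it goes) can be sketched as follows; the size bound d and LZC's non-adaptive phases are omitted from this minimal adaptive-phase version:

```python
def lzw_compress(data):
    """Greedy LZW: emit the code of the longest dictionary match, then extend
    the dictionary with that match plus the next byte (the adaptive phase)."""
    dictionary = {bytes([i]): i for i in range(256)}  # seeded with all single bytes
    w = b""
    out = []
    for byte in data:
        wc = w + bytes([byte])
        if wc in dictionary:
            w = wc                            # keep extending the current match
        else:
            out.append(dictionary[w])         # emit the longest match found
            dictionary[wc] = len(dictionary)  # LZC would stop growing at size d
            w = bytes([byte])
    if w:
        out.append(dictionary[w])
    return out

codes = lzw_compress(b"abababab")             # [97, 98, 256, 258, 98]
```

It is exactly this greedy longest-match choice whose cost the analysis compares against the optimal factorization decodable by the same dictionary, yielding the Θ(d) bound.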
- Published
- 2024
- Full Text
- View/download PDF
50. On the conception of proper names and deproprial expressions in the academic dictionary of contemporary Czech.
- Author
-
Georgievová, Jana, Křivan, Jan, Lišková, Michaela, and Šemelík, Martin
- Subjects
- *
ENCYCLOPEDIAS & dictionaries , *LEXICOGRAPHY , *MICROSTRUCTURE - Abstract
This paper is a conceptual supplement sui generis, the aim of which is to present a modified treatment of proper names and deproprial expressions in the Academic Dictionary of Contemporary Czech (ADCC). The focus of the study is both a reflection on the past and current lexicographical practice and a discussion of the key issues related to the treatment of the respective lexical subsystem in the dictionary. First, we summarize the basic facts concerning the lexicographic processing of proprial and deproprial lexical units in the field of explanatory lexicography. Second, we provide some more general information about the ADCC and, most importantly, about the macrostructure and microstructure of the dictionary in reference to the topic of the present study. We focus on the inclusion of proprial and deproprial entries in the ADCC and the specific treatment of proper names contained in phrasemes. Special attention is paid to the microstructure of proprial and deproprial entries, as well. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF